Relevancer: Finding and Labeling Relevant Information in Tweet Collections

نویسندگان

  • Ali Hurriyetoglu
  • Christian Gudehus
  • Nelleke Oostdijk
  • Antal van den Bosch
چکیده

We introduce a tool that supports knowledge workers who want to gain insights from a tweet collection, but due to time constraints cannot go over all tweets. Our system first pre-processes, de-duplicates, and clusters the tweets. The detected clusters are presented to the expert as so-called information threads. Subsequently, based on the information thread labels provided by the expert, a classifier is trained that can be used to classify additional tweets. As a case study, the tool is evaluated on a tweet collection based on the key terms ‘genocide’ and ‘Rohingya’. The average precision and recall of the classifier on six classes is 0.83 and 0.82 respectively. At this level of performance, experts can use the tool to manage tweet collections efficiently without missing much information.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Active Tweet Recommendation Based on User Interest Profiles

The rapid growth of Twitter has made it one of the most popular information sources of current affairs. Twitter users gather information about their topics of interest through their followees’ posts or by searching for relevant posts. However, users are often overwhelmed by the large number of tweets which makes it difficult for them to find relevant and non-redundant information about their in...

متن کامل

Open Archive Toulouse Archive Ouverte (oatao) Overview of Inex 2014

INEX investigates focused retrieval from structured documents by providing large test collections of structured documents, uniform evaluation measures, and a forum for organizations to compare their results. This paper reports on the INEX 2014 evaluation campaign, which consisted of three tracks: The Interactive Social Book Search Track investigated user information seeking behavior when intera...

متن کامل

Overview of INEX 2014

INEX investigates focused retrieval from structured documents by providing large test collections of structured documents, uniform evaluation measures, and a forum for organizations to compare their results. This paper reports on the INEX 2014 evaluation campaign, which consisted of three tracks: The Interactive Social Book Search Track investigated user information seeking behavior when intera...

متن کامل

Collective Semantic Role Labeling for Tweets with Clustering

As tweets have become a comprehensive repository of fresh information, Semantic Role Labeling (SRL) for tweets has aroused great research interests because of its central role in a wide range of tweet related studies such as fine-grained information extraction, sentiment analysis and summarization. However, the fact that a tweet is often too short and informal to provide sufficient information ...

متن کامل

QuickView: NLP-based Tweet Search

Tweets have become a comprehensive repository for real-time information. However, it is often hard for users to quickly get information they are interested in from tweets, owing to the sheer volume of tweets as well as their noisy and informal nature. We present QuickView, an NLP-based tweet search platform to tackle this issue. Specifically, it exploits a series of natural language processing ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016